
In this study, we investigated ubiquitylation and SUMOylation, the most widely studied ubiquitin family conjugations. From PLMD, our home-built database, we collected 121,742 lysine ubiquitylation sites and 8,115 lysine SUMOylation sites. By integrating these two data sets, we detected 3,363 UBS crosstalk sites in 1,302 proteins. For each lysine, we chose a local sliding window with a maximum of 31 residues flanking the site. We used the multiply modified sites as the positive data. Choosing the negative data is harder, because it is difficult to prove definitively that a lysine residue will never undergo crosstalk under any conditions. To address this problem, we used the singly modified sites as the negative data. A model trained on these sets can then predict whether an experimentally identified ubiquitylation site can also be regulated by SUMOylation, or vice versa. Such a procedure can efficiently exclude potentially non-functional UBS crosstalk that arises mainly by chance. In total, the positive set comprised 3,363 samples, whereas the negative set comprised 123,131 samples.
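The construction of the positive and negative sets can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. The function names, the representation of a site as a (protein accession, 1-based position) pair, and the reading of the 31-residue window as 15 residues on each side of the central lysine, truncated at the protein termini, are all assumptions made for illustration.

```python
from typing import Set, Tuple

# Hypothetical representation: (protein accession, 1-based lysine position).
Site = Tuple[str, int]


def split_sites(ub: Set[Site], sumo: Set[Site]) -> Tuple[Set[Site], Set[Site]]:
    """Return (positives, negatives) from two sets of modified lysines.

    Positives are sites carrying both modifications (crosstalk sites).
    Negatives are sites carrying exactly one of the two modifications.
    """
    positives = ub & sumo   # intersection: modified by both
    negatives = ub ^ sumo   # symmetric difference: modified by only one
    return positives, negatives


def extract_window(sequence: str, position: int, flank: int = 15) -> str:
    """Return the residues around a lysine, at most 2 * flank + 1 long.

    Assumption: the window is simply truncated near the termini
    rather than padded with a placeholder character.
    """
    idx = position - 1  # convert to a 0-based index
    if not 0 <= idx < len(sequence) or sequence[idx] != "K":
        raise ValueError(f"position {position} is not a lysine")
    return sequence[max(0, idx - flank): idx + flank + 1]


def expected_negative_count(n_ub: int, n_sumo: int, n_both: int) -> int:
    """Number of singly modified sites implied by the three counts."""
    return (n_ub - n_both) + (n_sumo - n_both)
```

With the counts reported above, `expected_negative_count(121742, 8115, 3363)` gives 123,131, which agrees with the size of the negative set.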